ABSTRACT

This thesis presents a design and partial implementation
of a program family of extended pretty printers. Factors
that influence the readability (perception) and
understandability (cognition) of computer programs are
identified, previous work is reviewed, and new solutions are
suggested. Extensions to previous pretty printer designs
include a capability to selectively display levels of control
of a program. In order to accommodate different computer
languages and to allow for several secondary functions, a
family of pretty printers is designed. This design
facilitates easy extension, contraction, and modification.
I. INTRODUCTION



Programs are written to be read and understood by
people. The textual representation of a program should
therefore be easy to read: it should reduce the visual burden
on the reader and allow the reader to develop and exploit
visual clues that aid in reading. In addition, the text of
the program should be designed so that it is easy for the
reader to grasp the meaning of the program; that is, the
representation should help the reader understand the program.

Fifteen years ago Dijkstra argued that "... our
intellectual powers are rather geared to master static
relations and our powers to visualize processes evolving in
time are relatively poorly developed. For that reason we
should do (as wise programmers aware of our limitations) our
utmost to shorten the conceptual gap between the static
program and the dynamic process, to make the correspondence
between the program (spread out in text space) and the
process (spread out in time) as trivial as possible."
[Ref. 16]. There is an additional conceptual gap between the
program spread out in text and how we represent and
manipulate the static program and its dynamic process in our
minds. Here also we should try to narrow the conceptual gap
so that the program is easy to read and to understand.

In the computer science literature, readability and
understandability are often used interchangeably.
Readability is related to physical conditions, for instance,
the size, type font, color, and clarity of characters, proper
indentation, and the spacing between lines.
Understandability is related to psychological conditions, for
instance, pattern, memory, logic, and repetition learning.
Precisely speaking, readability means good perception and
understandability means good cognition. The system designed
in this thesis seeks to improve both readability and
understandability by reformatting computer programs and
presenting the user with alternative representations to aid
understanding.

There is evidence that the readability and
understandability of computer programs are important issues
directly related to programmer productivity. Although this
has been recognized for some time, further improvements in
the textual representation of computer programs are possible.
This thesis will review the previous work, analyze the
remaining problems, and propose new solutions.
II. EXTENDED PRETTY PRINTER



A. BACKGROUND 

In a study of commercial programming practices, Elshoff
[Ref. 5] found that most programs were poorly written. They
were very large, extremely difficult to read and understand,
and more complex than necessary. Furthermore, the study
determined that programming language usage was poor and
inconsistent. The results of the survey by Lientz [Ref. 6]
show that the quality of programming is a generally
perceived problem. There has been a major effort to improve
programming practices, but there still exist many programs
that are difficult to read and understand and yet must
regularly be corrected or modified.

There are many factors connected with the readability
and understandability of a computer program. The reader's
familiarity with the program, knowledge of the application
area, and own programming style are important factors that
are mostly independent of the program [Ref. 4]. This thesis
concentrates on the representation of program text that
impacts its readability and understandability. A readable
program always seems to exhibit a common set of properties
[Ref. 8] [Ref. 9] [Ref. 10]. The program is well commented.
The logical structure of the program is constructed from a
common set of single-entry single-exit flow of control
units. Variable names are mnemonic and references to them
are localized. The program's physical layout makes the
salient features of the algorithm that is implemented stand
out [Ref. 14].
Since abstraction is an important mechanism that people
use to understand programs, the suppression of details in a
program can aid understanding. Modern design methodologies
include top down design using stepwise refinement. In this
methodology, the programmer designs successive levels of the
program. These levels are visible during the design but are
often not visible in the final program. The
understandability of a program can be improved by making the
levels of the program structure visible. It is true that a
program may have all these properties and still be unreadable
and not understandable; however, the readability and
understandability of a program are certain to suffer when it
lacks one or more of the properties [Ref. 14].

B. DEFINITION 

Rubin [Ref. 14] defined a pretty printer as follows:
"It is a software tool to format programs to make them
easier to read and understand." The extended pretty printer
can be defined as a software tool that improves readability
and understandability by adding level documentation,
commenting, and reformatting. These extensions to pretty
printers help people understand a program by making its
structure more visible and by supporting the viewing of its
levels. The primary function of an extended pretty printer
is to add level documentation and comments, to insert spaces
and linefeeds between tokens (character strings), and to
decide where and how to break lines that are too long to fit
on the output medium.
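
As a rough illustration of the spacing and line-breaking part of
this function, the following Python sketch joins tokens with single
blanks and starts a new line whenever the next token would exceed
the output width. The name fill_tokens, its parameters, and the
fixed continuation indent are assumptions made for this
illustration only; they are not part of the design presented in
this thesis.

```python
def fill_tokens(tokens, width, indent=0, continuation=4):
    """Join tokens with single blanks, breaking lines longer than `width`.

    The first line starts at column `indent`; continuation lines are
    indented a further `continuation` columns (an assumed policy).
    """
    lines = []
    current = " " * indent
    empty = True  # True while the current line holds no token yet
    for tok in tokens:
        candidate = current + tok if empty else current + " " + tok
        if len(candidate) > width and not empty:
            # The token does not fit: close this line, start a new one.
            lines.append(current)
            current = " " * (indent + continuation) + tok
        else:
            current = candidate
        empty = False
    if not empty:
        lines.append(current)
    return lines
```

A token that is itself wider than the output medium is placed on a
line of its own rather than split; a real pretty printer would need
an explicit policy for that case.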
C. GOALS

The methods for improving the readability and
understandability of a program use a set of specific
transformations that can be applied to the program text. The
following program transformations can be done by an extended
pretty printer.

1. Reformat

The consistent formatting of programs is very important.
Elshoff [Ref. 14] said "Just as paragraphing and sectioning
help written English, so can indentation, keyword
positioning, and logical grouping aid a programming
language." These jobs can be done automatically by a pretty
printer, which allows the program to be read more easily.

2. Add Level Structure Documentation

In writing about his experiments on program
comprehension, Shneiderman [Ref. 17] said "Instead of
absorbing the program on a character-by-character basis,
programmers recognize the function of groups of statements
and then piece together these chunks to form ever larger
chunks until the entire program is comprehended." This
experiment suggests that the level documentation (chunks) of
a program will help the understandability of the program.

3. Standardization

Standardization contributes to the understandability of
a program. To understand this, it is helpful to know the
source of the expert programmer's capacity. The primary
piece of direct behavioral evidence for this is
Shneiderman's replication [Ref. 26] for programming of Chase
and Simon's classic study on memory for chess positions
[Ref. 27]. In both these studies, the experts in a
particular domain could memorize information from that domain
(i.e., a program or a chess position) far better than
novices, provided that the information was appropriately
structured. If the structure was made random (by shuffling
the statements of the program or rearranging the chess
pieces), the advantage of the expert was greatly reduced.
That means that the expert has no better memory than the
novice, but rather an elaborate knowledge structure in terms
of which correspondingly structured items can be very
efficiently encoded [Ref. 15].

If this result is extended to programming, it suggests
that the expert programmer gets his better knowledge of
programs from visible program structure. As noted above, if
the textual representation is not structured (e.g., random),
the expert programmer will lose part of his capability.
People understand something better when they can integrate
it with what they already know. From this view,
standardization helps people to understand other people's
programs more quickly. Visual cues are important in order
to unburden the program reader. The final objectives of
computer program standards are to ensure consistency, reduce
program development and testing time, improve
maintainability of programs, and improve changeability of
programs [Ref. 12]. Programming standards are not intended
to stifle the imagination of programmers. Experiments of
Godfrey [Ref. 12] have shown that standards simply remove
the drudgery of coding and allow programmers to concentrate
more on the problem at hand. It should be noted that the
establishment of standards is a costly process, and that
programming standards are not a panacea for eliminating all
poorly written programs. Adherence to these standards will
not automatically produce 'good' code [Ref. 12].
There are multiple levels of understanding a program.
It is possible to follow each line of code without
understanding the overall program function. It may also be
possible to understand the program function but not
understand each of the steps. There is also a middle level
of understanding concerning control structures, module
design, and data structures [Ref. 17]. Skimming for a top
down view means suppressing detail until the overall program
is understood. Then the program is read selectively and
understood in more detail.



4. Example



The following example will show how reformatting, level
structure documentation, and standardization help the
readability and understandability of a program.

The bubble sort algorithm will be used for this example
[Ref. 18]. The idea of the bubble sort is as follows: "We
go through a list comparing adjacent items and exchanging
those that are out of order. During such a
compare-and-exchange pass, an item moves forward in the list
until it 'bumps up against' a larger item." [Ref. 18]. An
algorithm language [Ref. 18] and structured FORTRAN will be
used for this example.



THE ALGORITHM FOR BUBBLE_SORT :

ALGORITHM BUBBLE-SORT
   INPUT N
   INPUT LIST(1 : N)
   REPEAT
      NO-EXCHANGES <-- TRUE
      FOR I <-- 1 TO N - 1 DO
         IF LIST(I) > LIST(I+1) THEN
            TEMP <-- LIST(I)
            LIST(I) <-- LIST(I+1)
            LIST(I+1) <-- TEMP
            NO-EXCHANGES <-- FALSE
         END IF
      END FOR
   UNTIL NO-EXCHANGES
   OUTPUT LIST(1 : N)
END BUBBLE-SORT
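
For readers who want to run the algorithm, the following is a
minimal Python transcription of the algorithm language text above.
The function name bubble_sort and the use of 0-based indexing are
choices made for this sketch; the comments point back to the
corresponding lines of the algorithm.

```python
def bubble_sort(items):
    """Sort a list in place using repeated compare-and-exchange passes."""
    n = len(items)
    while True:                      # REPEAT
        no_exchanges = True          # NO-EXCHANGES <-- TRUE
        for i in range(n - 1):       # FOR I <-- 1 TO N - 1 (0-based here)
            if items[i] > items[i + 1]:
                # Exchange adjacent items that are out of order.
                items[i], items[i + 1] = items[i + 1], items[i]
                no_exchanges = False
        if no_exchanges:             # UNTIL NO-EXCHANGES
            break
    return items
```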
UNFORMATTED FORTRAN PROGRAM FOR BUBBLE_SORT :

      INTEGER LIST(100),I,N,TEMP
      LOGICAL NOEXG
      READ (5,100) N
100   FORMAT (I5)
      READ (5,100) LIST
5     CONTINUE
      NOEXG = .TRUE.
      DO 777 I = 1, N-1
      IF (.NOT. (LIST(I) .GT. LIST(I+1))) GO TO 10
      TEMP = LIST(I)
      LIST(I) = LIST(I+1)
      LIST(I+1) = TEMP
      NOEXG = .FALSE.
10    CONTINUE
777   CONTINUE
      IF (.NOT. NOEXG) GO TO 5
      WRITE (6,200) LIST
200   FORMAT (1X, I7)
      STOP
      END



The following shows some of the possible outputs of an
extended pretty printer. Indentation is used to improve
readability. Selective display of the levels of the control
structure of the program, both in FORTRAN and in a
generalized programming language, is used to support
improved understandability. The reader selects the textual
representation that best supports his current perceptual and
cognitive needs.



LEVEL I A :

      INTEGER LIST(100), I, N, TEMP
      LOGICAL NOEXG
      READ (5,100) N
      READ (5,100) LIST
      REPEAT
         COMPOUND STATEMENT
      UNTIL NOT NOEXG
      WRITE (6,200) LIST
      STOP
100   FORMAT (I5)
200   FORMAT (1X, I7)
      END

LEVEL I B :

      DECLARATION
      DECLARATION
      SIMPLE STATEMENT
      SIMPLE STATEMENT
      REPEAT
         COMPOUND STATEMENT
      ENDUNTIL
      SIMPLE STATEMENT
      STOP
      END



This shows the first level of the bubble sort program.
Here only the highest level of abstraction is represented,
so the reader can see simply the highest level of structure
and can understand the overall design easily. The reader
can select to show additional levels until the completed
program is displayed.
LEVEL II A :

      INTEGER LIST(100), I, N, TEMP
      LOGICAL NOEXG
      READ (5,100) N
      READ (5,100) LIST
5     CONTINUE
         NOEXG = .TRUE.
         FOR I = 1 TO N - 1
            COMPOUND STATEMENT
         ENDFOR
      IF (.NOT. NOEXG) GO TO 5
      WRITE (6,200) LIST
      STOP
100   FORMAT (I5)
200   FORMAT (1X, I7)
      END



LEVEL II B :

      DECLARATION
      DECLARATION
      SIMPLE STATEMENT
      SIMPLE STATEMENT
      REPEAT
         SIMPLE STATEMENT
         DO FOR
            COMPOUND STATEMENT
         ENDFOR
      ENDREPEAT
      SIMPLE STATEMENT
      STOP
      END
LEVEL III A :

      INTEGER LIST(100), I, N, TEMP
      LOGICAL NOEXG
      READ (5,100) N
      READ (5,100) LIST
5     CONTINUE
         NOEXG = .TRUE.
         DO 777 I = 1, N - 1
            IF LIST(I) > LIST(I+1) THEN
               COMPOUND STATEMENT
            ENDIF
777      CONTINUE
      IF (.NOT. NOEXG) GO TO 5
      WRITE (6,200) LIST
      STOP
100   FORMAT (I5)
200   FORMAT (1X, I7)
      END

LEVEL III B :

      DECLARATION
      DECLARATION
      SIMPLE STATEMENT
      SIMPLE STATEMENT
      REPEAT
         SIMPLE STATEMENT
         DO FOR
            IF CONDITION THEN
               COMPOUND STATEMENT
            ENDIF
         ENDFOR
      ENDREPEAT
      SIMPLE STATEMENT
      STOP
      END



For most experienced programmers who are familiar with
top down design with stepwise refinement, the following
representations are easier to read and understand than the
initial programs.



FINAL SOURCE PROGRAM :

      INTEGER LIST(100), I, N, TEMP
      LOGICAL NOEXG
C
      READ (5,100) N
      READ (5,100) LIST
5     CONTINUE
         NOEXG = .TRUE.
         DO 777 I = 1, N - 1
            IF (.NOT. (LIST(I) .GT. LIST(I+1))) GO TO 10
               TEMP = LIST(I)
               LIST(I) = LIST(I+1)
               LIST(I+1) = TEMP
               NOEXG = .FALSE.
10          CONTINUE
777      CONTINUE
      IF (.NOT. NOEXG) GO TO 5
      WRITE (6,200) LIST
      STOP
100   FORMAT (I5)
200   FORMAT (1X, I7)
      END



FINAL STRUCTURE DOCUMENTATION :

      DECLARATION
      DECLARATION
      SIMPLE STATEMENT
      SIMPLE STATEMENT
      REPEAT
         SIMPLE STATEMENT
         DO FOR
            IF CONDITION THEN
               SIMPLE STATEMENT
               SIMPLE STATEMENT
               SIMPLE STATEMENT
               SIMPLE STATEMENT
            ENDIF
         ENDFOR
      ENDREPEAT
      SIMPLE STATEMENT
      STOP
      END
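
The selective display of levels shown in this example can be
mimicked by a short Python sketch. It assumes the program structure
is already available as a list of (depth, text) pairs; the names
show_level and BUBBLE, and the indentation step of three blanks, are
hypothetical and chosen only for this illustration. Every run of
lines nested at the requested level or deeper is replaced by a
single COMPOUND STATEMENT placeholder.

```python
def show_level(structure, level, step=3):
    """Render (depth, text) pairs, hiding everything nested at `level` or deeper.

    Each run of hidden lines is replaced by one COMPOUND STATEMENT
    placeholder, indented to the depth at which detail was suppressed.
    """
    out = []
    hidden = False  # True while inside a run of suppressed lines
    for depth, text in structure:
        if depth < level:
            out.append(" " * (step * depth) + text)
            hidden = False
        elif not hidden:
            out.append(" " * (step * level) + "COMPOUND STATEMENT")
            hidden = True
    return out


# Structure documentation of the bubble sort program, with nesting depths.
BUBBLE = [
    (0, "DECLARATION"), (0, "DECLARATION"),
    (0, "SIMPLE STATEMENT"), (0, "SIMPLE STATEMENT"),
    (0, "REPEAT"),
    (1, "SIMPLE STATEMENT"),
    (1, "DO FOR"),
    (2, "IF CONDITION THEN"),
    (3, "SIMPLE STATEMENT"), (3, "SIMPLE STATEMENT"),
    (3, "SIMPLE STATEMENT"), (3, "SIMPLE STATEMENT"),
    (2, "ENDIF"),
    (1, "ENDFOR"),
    (0, "ENDREPEAT"),
    (0, "SIMPLE STATEMENT"), (0, "STOP"), (0, "END"),
]
```

Calling show_level(BUBBLE, 1) gives a first-level view in which the
whole body of the REPEAT loop is one placeholder, and increasing the
level argument reveals one more level of nesting at a time.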






III. SOME APPROACHES AND VARIOUS OBJECTIVES



A. SOME APPROACHES

There have been many attempts to improve
understandability and readability. The following are typical
examples.

1. Neater2

Neater2 accepts a PL/I source program and operates 
on it to produce a reformatted version. When in the LOGICAL 
mode, it indicates the logical structure of the source 
program in the indentation pattern of its output. A number 
of options are available to give the user full control over 
the output format and to maximize its utility. [Ref. 19] 

2. Prettyprint

Prettyprint takes as input a Pascal program and reformats
the program according to a standard set of pretty printing
rules. The pretty printing rules are given, i.e., fixed.
[Ref. 22]

3. Pascal Program Formatter

Format is a flexible pretty printer for Pascal
programs. It takes as input a syntactically correct Pascal
program and produces as output an equivalent but reformatted
Pascal program. The resulting program consists of the same
sequence of Pascal symbols and comments, but they are
rearranged with respect to line boundaries and columns for
readability. [Ref. 20]
The flexibility of Format is accomplished by allowing
the user to supply various directives (options) which
override the default values. Format is not a rigid pretty
printer that decides by itself how a program is to be
formatted: the user can control how formatting is done, not
only prior to execution but also during execution, through
the use of pretty printer directives embedded in the
program. [Ref. 20]



4. Contour 



Contour is a program whose purpose is to graphically
illustrate a program's structure. It operates by bounding
the scope of loops and conditionals by solid (or nearly
solid) lines. When compound statements are embedded in other
compound statements, one obtains, rather than confusion, a
rather pleasant display reminiscent of the contour lines of
a topographical map. [Ref. 22]



5. Syntax-Directed Pretty Printer



The syntax-directed pretty printer is language
independent. It is divided into two phases: the grammar
processing phase and the program processing phase. A grammar
for the specific language must be provided. It is much
easier and quicker to write a grammar for a language than to
code a new pretty printer for that language. It can work for
all structured programming languages and, with minor
modifications, can work for other languages. It can handle
such problems as comments and error recovery. [Ref. 14]

6. Others

The recent availability of low cost, high quality
computer printers allows additional opportunities to improve
readability and understandability. Important characters or
words can be represented with different fonts: for instance,
the keywords can be represented by bold characters or be
underlined so that they are recognized more easily than other
words. This can improve the readability of a program.

B. VARIOUS OBJECTIVES

Although the final objective of all approaches is to
improve the readability and understandability of the
program, there are many secondary objectives. The following
are typical examples of them:

Teaching structure: An automatic system that checks
structure and indentation can help beginning students learn
good programming practice. A system that gives clear
corrections to mistakes can provide a student with quick
feedback. Such a system helps a student to learn structured
programming and to learn a set of programming standards.

Standardization in a programming organization: For large
software projects with many programmers, program
standardization is necessary to help in communication among
programmers.

Reformatting for maintenance: There are many programs that
are very difficult to read. The maintenance process can be
helped if programs can be transformed into a form that is
familiar to the maintenance programmers. The scoping
capability of an extended pretty printer as described above
can also help programmers understand programs they are
correcting and modifying.

Automatic corrections: An extended pretty printer can check
the indentation of programs, correct indentation errors, and
give the user messages explaining the errors.
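
A minimal Python sketch of such an indentation check is given
below. The keyword sets OPENERS and CLOSERS, the function name
check_indentation, and the three-blank step are assumptions made
for this illustration; a real family member would take them from
the language module and the indentation policy.

```python
OPENERS = {"REPEAT", "FOR", "IF"}           # assumed opening keywords
CLOSERS = {"ENDREPEAT", "ENDFOR", "ENDIF"}  # assumed closing keywords


def check_indentation(lines, step=3):
    """Return (line_number, expected, found) for each misindented line."""
    errors = []
    depth = 0
    for number, line in enumerate(lines, start=1):
        words = line.split()
        if not words:
            continue  # blank lines carry no indentation information
        if words[0] in CLOSERS:
            depth = max(depth - 1, 0)
        expected = depth * step
        found = len(line) - len(line.lstrip(" "))
        if found != expected:
            errors.append((number, expected, found))
        if words[0] in OPENERS:
            depth += 1
    return errors
```

Each reported triple carries enough information both to print a
message for the user and to correct the line by replacing its
leading blanks with the expected number.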
From the above observations, several common properties
of the existing approaches can be found. First, most of the
systems are for a specific programming language; for another
programming language they would have to be written again.
The one exception is the syntax directed pretty printer,
which requires a grammar for each new language. Defining a
correct grammar is not an easy task. Second, most of the
systems try to make the pretty printer flexible, but the
flexibility is limited to a few options and it is not easy
to extend them. Most constructs of the pretty printers are
fixed, although the constructs themselves could be changed,
that is, extended or contracted, and new structures for
indentation could be generated.
IV. PROGRAM FAMILY



A. DEFINITION 

Program families are defined by Parnas [Ref. 13] as sets
of programs whose common properties are so extensive that it
is advantageous to study the common properties of the
programs before analyzing individual members. Program
families are analogous to the hardware families promulgated
by several manufacturers. Although the various models in a
hardware family might not have a single component in common,
almost everyone reads the common 'principles of operations'
manual before studying the special characteristics of a
specific model [Ref. 13].

B. DESIGN METHODOLOGY 

Parnas [Ref. 13] shows how module specifications define
a family. This is an important guide for selecting the
design method. Members of a family of programs defined by a
set of module specifications can vary in three principal
ways.

1. Implementation methods used within the modules. 

Any combination of sets of programs which meet the
module specifications is a member of the program family.
Subfamilies may be defined either by dividing each of the
main modules into submodules in alternative ways, or by
using the method of structured programming to describe a
family of implementations for the module.
2. Variation in the external parameters.

The module specifications can be written in terms of
parameters so that a family of specifications results.
Programs may differ in the values of those parameters and
still be considered to be members of the program family.

3. Use of subsets. 

In many situations one application will require only a 
subset of the functions provided by a system. We may 
consider programs which consist of a subset of the programs 
described by a set of module specifications to be members of 
a family as well. 

As discussed above, there are many primary and secondary
objectives for a pretty printer. One approach to these
various demands would be to design a large program with many
options. Such an approach has several drawbacks. First, the
resulting program would be large and necessarily complex.
Second, for each specific use of the program the unneeded
options would most likely impose an unnecessary computational
burden. The notion of a program family offers an alternative
design. A separate program will be written for each
different demand; however, all these programs will share a
common design, and many modules will be common to several
family members.
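
The idea can be sketched in a few lines of Python. Here one shared
module (blank_indent) is reused by every family member, a small
keyword table stands in for each language module, and a parameter
selects a secondary function. All names and the contents of the
keyword tables are hypothetical and greatly simplified; the sketch
only illustrates how members can differ while sharing a design.

```python
def blank_indent(depth, step=3):
    """Shared 'blank' module: leading blanks for a nesting depth."""
    return " " * (depth * step)


# One keyword table per language module (contents are illustrative only).
STRUCTURED_FORTRAN = {"open": {"REPEAT", "FOR"},
                      "close": {"ENDREPEAT", "ENDFOR"}}
PASCAL = {"open": {"BEGIN", "REPEAT"}, "close": {"END", "UNTIL"}}


def make_printer(language, show_level=False):
    """Build one family member from the shared and language-specific parts."""
    def printer(statements):
        out, depth = [], 0
        for stmt in statements:
            words = stmt.split()
            first = words[0].upper() if words else ""
            if first in language["close"]:
                depth = max(depth - 1, 0)
            # Optional secondary function: prefix each line with its level.
            prefix = f"{depth} " if show_level else ""
            out.append(prefix + blank_indent(depth) + stmt.strip())
            if first in language["open"]:
                depth += 1
        return out
    return printer
```

A Pascal reformatter and a FORTRAN level lister are then two members
of the same family: make_printer(PASCAL) and
make_printer(STRUCTURED_FORTRAN, show_level=True). Neither carries
code for options it does not use.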

The concept of program families provides one way of
considering program structure more objectively. For any
precise description of a program family (either an
incomplete refinement of a program or a set of
specifications or a combination of both) one may ask which
programs have been excluded and which still remain
[Ref. 13]. The criteria for defining modules can be a way to
select or distinguish some design methodologies [Ref. 3].
C. PROGRAMMING LANGUAGE FOR OBJECT ORIENTED DESIGN

A design methodology alone is not sufficient to create
computer solutions [Ref. 3]. Some features of a programming
language can also help in creating good software. In the
following table, P. Wegner has categorized some of the most
popular languages into generations, along with some of the
language features they introduced:

TABLE I

Programming Language Generation Table

   Generation   Languages                 Period
   ----------   -----------------------   -----------
   1ST          FORTRAN I, ALGOL58        1954 - 1958
   2ND          FORTRAN II, ALGOL60,      1959 - 1961
                COBOL, LISP
   3RD          PL/I, ALGOL68, PASCAL     1962 - 1970
   GAP                                    1970 - 1980

ADA was developed at the end of the language generation
gap, and so has been influenced by contemporary software
methodologies. The following figures show the topologies of
each generation and ADA. ADA's topology is not flat like
those of the previous generations, but rather is
multi-dimensional [Ref. 3].
Figure 4.1 Topology for 1st and 2nd Generation.

Figure 4.2 Topology for 2nd and 3rd Generation.

Figure 4.3 Topology of ADA.

The following key features of ADA will support the
tools for implementing the object oriented design [Ref. 23].

1. Programming in the large. 

Mechanisms for encapsulation, separate compilation, and 
library management are necessary for the writing of portable 
and maintainable programs of any size. 

2. Exception handling. 

Large programs are rarely correct. It is necessary to
provide a means whereby a program can be constructed in a
layered and partitioned way so that the consequences of
errors in one part can be contained.

3. Data abstraction. 

Extra portability and maintainability can be obtained
if the details of the representation of data can be kept
separate from the specifications of the logical operations on
the data.



4. Tasking.

For many applications it is important that the program
be conceived as a series of parallel activities rather
than just as a single sequence of actions. Building
appropriate facilities into a language rather than adding
them via calls to an operating system gives better
portability and reliability.

5. Generic units. 

In many cases the logic of part of a program is
independent of the types of the values being manipulated. A
mechanism is therefore necessary for the creation of related
pieces of program from a single template. This is
particularly useful for the creation of libraries.
V. MY SOLUTION

A. PROBLEM AND SOLUTION

As shown above, most traditional approaches to pretty
printers are for a specific programming language. A recent
development is the syntax directed pretty printer, which can
be used for different languages by providing a grammar of
the language. The requirement to provide a language grammar
represents a non-trivial task. There are many different
secondary objectives for a pretty printer for different
users. The functions of traditional pretty printers are not
enough to improve both readability and understandability;
for example, the program level construct documentation that
is needed to help understand a given program is not
supported by traditional approaches. In short, there are
many programming languages and many purposes, but there is
no system that satisfies all those requests and can be
modified easily.

In the previous section, the concept of a program family
was discussed. The best way to meet the various demands and
handle many programming languages is to construct a program
family for the extended pretty printer. The characteristics
of a program family will permit easy change, easy extension,
and easy contraction. Each programming language will have a
module of its own, and data abstraction and procedural
abstraction will be used to hide design decisions that
differ among the members of the program family. Data and
procedural abstraction will also allow some modules to be
used by all program family members. For example, the blank
operations are an important data abstraction. These
operations can be used for all programming languages and
objectives.
B. GENERALIZED PROGRAMMING LANGUAGE CONSTRUCT



For generalized indentation and level documentation, a
general internal representation of program structure is
required that is independent of any particular programming
language. Let us call it a generalized formatter structure.
Since there are many programming language constructs in the
many different programming languages, it is too difficult to
define a perfect universal programming language formatter
construct. So, we define here a generalized programming
language formatter construct that covers only a limited
number of programming languages: structured FORTRAN, PASCAL,
and some other structured programming languages. For
simplicity, the detailed representation of a simple
statement will be omitted.

The structure of the program will be shown by indenting
the constructs. First, the control structure will be
considered. Dijkstra argued that control flow should be
limited to three basic structures: linear sequence,
structured selection, and structured iteration. But many
programmers use the following five structures: if, case,
while, until, and do for. The block can also be an element
of the structure. Second, most program units are divided
into two parts: a declarative part and an imperative part.
This is also important for the indentation. Appendix A
describes in detail the generalized format structures.
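
One possible in-memory form for such constructs is sketched below
in Python. It is not the structure defined in Appendix A; the names
Kind, Construct, and depth are hypothetical, and the sketch only
shows that the five control structures, the block, declarations,
and simple statements can share one language-independent tree
representation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    """Construct kinds named in the text (labels are illustrative)."""
    DECLARATION = "DECLARATION"
    SIMPLE = "SIMPLE STATEMENT"
    IF = "IF"
    CASE = "CASE"
    WHILE = "WHILE"
    UNTIL = "REPEAT"
    DO_FOR = "DO FOR"
    BLOCK = "BLOCK"


@dataclass
class Construct:
    kind: Kind
    body: list = field(default_factory=list)  # nested Construct objects


def depth(node):
    """Nesting depth of a construct: 1 for a statement with no body."""
    return 1 + max((depth(child) for child in node.body), default=0)
```

With this representation, the loop of the bubble sort example is an
UNTIL construct containing a simple statement and a DO FOR
construct, which in turn contains an IF construct.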
C. ANALYSIS AND DESIGN



1. Analysis

The extended pretty printer has two basic functions.
The first is to reformat the source program, that is, to
indent, to insert spaces and linefeeds between tokens, and
to decide where and how to break lines that are too long to
fit on the output medium. The second is to produce level
structure documentation of the source program. The basic
requirement of the total system is that it has to be easy to
change, easy to extend, and easy to contract; that is, it
should be independent of the programming language and should
be able to fulfill a variety of purposes.

Every structured programming language can be
represented as English is: character, word, statement,
compound statement (paragraph), and program unit (a paper).
What is of interest is the way to represent these components
as lines. The relationship between these components and
lines is very important for the extended pretty printer.
Table II represents the relationship between line and
statement. The other components have some relation with
statements. So, every component can be represented by lines.

Each level is represented by the source program 
structures. The structures are represented by statements. 
So, each statement can have a level degree. 

2. Design

As noted in the section on program families, the most
important aspect of this system design is to identify the
objects. For indentation, the line and the statement are
basic elements. Blank is another important object. For the
construct representation, level has to be an object.
TABLE II

Relationship Table

   LINE    |  STATEMENT
   --------+----------------------
   one     |  one
   one     |  many
   one     |  part
   one     |  part and one/many

("part" means part of a statement)

The heavily dependent parts should be encapsulated in a
module to allow for easy change. Because the indentation
policy can be changed in various ways, it needs to be
manipulated independently. To manipulate the input
programming languages independently, the program should be
an independent module. The program module needs some data
structures (STACK, QUEUE), a keywords table, and some
statement operations. The files (input source file and
output file) and their formats can change easily, so the
input/output file manipulations need to be separated from
the other modules.

For convenience, the modules will be divided into two
kinds. One kind is passive modules, which are used by other modules
but do not use other modules, for example, blank,
level, stack, queue and line. The other kind is active
modules, which use the other modules, for example, input,
output, program and so on. ADA will be used for the detailed
design of the system. The following shows the detailed
design.
a. Passive modules

(1). Stack Module. This module provides some
stack operations. It provides the following procedures
for other modules that use them [Ref. 24].



generic
   type ITEM is private;
package STACK is

   type LIST is private;
   procedure CREATE (L : out LIST);
   procedure PUSH (L : in out LIST; I : in ITEM);
   procedure POP (L : in out LIST);
   function TOP (L : LIST) return ITEM;
   underflow : EXCEPTION;

private
   type NODE;
   type LIST is access NODE;
   type NODE is record
      head : ITEM;
      tail : LIST;
   end record;
end STACK;
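The stack interface above can be illustrated with a short sketch. It is written in Python rather than ADA purely for illustration; the names Stack, Underflow and _Node are assumptions of this sketch and do not come from the thesis. It mirrors the linked head/tail representation and raises an exception on underflow.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class Underflow(Exception):
    """Raised when pop or top is applied to an empty stack."""


@dataclass
class _Node(Generic[T]):
    head: T                        # the stored item
    tail: "Optional[_Node[T]]"     # the rest of the stack


class Stack(Generic[T]):
    """Linked stack; a new Stack() plays the role of CREATE."""

    def __init__(self) -> None:
        self._top: Optional[_Node[T]] = None

    def push(self, item: T) -> None:
        # The new node becomes the head; the old stack becomes its tail.
        self._top = _Node(item, self._top)

    def pop(self) -> None:
        if self._top is None:
            raise Underflow("pop from an empty stack")
        self._top = self._top.tail

    def top(self) -> T:
        if self._top is None:
            raise Underflow("top of an empty stack")
        return self._top.head
```

As in the specification, pop only removes the top item; a caller that needs the value reads it with top first.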



(2). Queue Module. This module provides some
QUEUE operations. It provides the following procedures
for other modules that use them [Ref. 24].



generic
   type ITEM is private;
package QUEUE is

   type LIST is private;
   procedure CREATE (L : out LIST);

   procedure ENQUEUE (L : in out LIST; I : in ITEM);
   -- Insert the item at the rear of the QUEUE

   procedure DEQUEUE (L : in out LIST; I : out ITEM);
   -- Delete the item from the front of the QUEUE

   underflow : EXCEPTION;

private
   type NODE;
   type LIST is access NODE;
   type NODE is record
      head : ITEM;
      tail : LIST;
   end record;
end QUEUE;
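A comparable sketch of the queue interface follows, again in Python for illustration only. The names Queue, QueueUnderflow and _QNode are hypothetical, and keeping a separate rear pointer is an assumption of this sketch; the specification above only shows a single access type.

```python
class QueueUnderflow(Exception):
    """Raised when dequeue is applied to an empty queue."""


class _QNode:
    def __init__(self, head):
        self.head = head   # the stored item
        self.tail = None   # the next node toward the rear


class Queue:
    """Linked FIFO queue; a new Queue() plays the role of CREATE."""

    def __init__(self):
        self._front = None
        self._rear = None

    def enqueue(self, item):
        # Insert the item at the rear of the queue.
        node = _QNode(item)
        if self._rear is None:
            self._front = node
        else:
            self._rear.tail = node
        self._rear = node

    def dequeue(self):
        # Delete the item from the front of the queue and return it.
        if self._front is None:
            raise QueueUnderflow("dequeue from an empty queue")
        node = self._front
        self._front = node.tail
        if self._front is None:
            self._rear = None
        return node.head
```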

(3). Blank Module. This module provides all
blank operations (insert, remove, count and so on) for
other modules that need them.



generic
   type INPUT is private;
package BLANK is

   BLK : constant CHARACTER := ' ';

   subtype NUM is NATURAL;

   procedure INSERT (N, M : in NUM; P : out INPUT);
   -- N : The start column of a line
   -- M : The number of blanks to be inserted

   procedure DELETE (N, M : in NUM; P : out INPUT);
   -- N : The start column of a line
   -- M : The number of blanks to be deleted

   procedure START (L : in INPUT; N : out NUM);
   -- N : The number of blanks in a line
   --     from the start column

   function IS_BLANK (C : in CHARACTER) return BOOLEAN;
   -- Check whether the input character is blank
   -- If blank, return TRUE
   -- Else, return FALSE

   overflow : EXCEPTION;

end BLANK;
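The blank operations can be sketched as plain string functions. This is a minimal Python illustration, assuming a line is held as a string and columns are numbered from 1; the function names insert_blanks, delete_blanks and leading_blanks are hypothetical stand-ins for INSERT, DELETE and START.

```python
BLK = " "


def insert_blanks(line: str, start: int, count: int) -> str:
    """Insert `count` blanks at 1-based column `start`."""
    if start < 1 or count < 0:
        raise ValueError("start must be >= 1 and count >= 0")
    i = min(start - 1, len(line))
    return line[:i] + BLK * count + line[i:]


def delete_blanks(line: str, start: int, count: int) -> str:
    """Delete up to `count` blanks beginning at column `start`.

    Deletion stops at the first non-blank character, so program
    text is never removed.
    """
    if start < 1 or count < 0:
        raise ValueError("start must be >= 1 and count >= 0")
    i = start - 1
    j = i
    while j < len(line) and j - i < count and line[j] == BLK:
        j += 1
    return line[:i] + line[j:]


def leading_blanks(line: str, start: int = 1) -> int:
    """Count the consecutive blanks beginning at column `start`."""
    i = start - 1
    n = 0
    while i + n < len(line) and line[i + n] == BLK:
        n += 1
    return n
```

For example, insert_blanks("DO 10", 1, 3) yields the same statement shifted right by three columns.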



(4). Level Module. This module provides
the level operations for other modules that need them. The
operations are:



package LEVEL is

   subtype NUM is NATURAL;

   procedure INCREASE (L : in out NUM);
   -- Increase the level
   -- L : input/output level number

   procedure DECREASE (L : in out NUM);
   -- Decrease the level

   procedure ZERO (L : in out NUM);
   -- Make the level zero or starting point.

   overflow : EXCEPTION;

   underflow : EXCEPTION;

end LEVEL;



(5). Line Module. This module manages the line
object. It provides a set of procedures available to other
modules that use the line.



generic
   type LINETYPE is private;
package LINE is

   type LINEPOINT is private;
   subtype NUM is NATURAL;
   subtype CHAR is CHARACTER;

   procedure GET_LINE
      (P : in out LINEPOINT; L : out LINETYPE);
   -- Get a whole line into the internal structure
   -- P : ID for a line
   -- L : Content of a line

   procedure PUT_LINE
      (P : in out LINEPOINT; L : in LINETYPE);
   -- Put an internal line into the linetype
   -- P : ID for a line
   -- L : Content of a line

   procedure LINE_LENGTH
      (P : in LINEPOINT; N : out NUM);
   -- Compute the line length
   -- P : ID for a line

   procedure GET_CHAR
      (P : in LINEPOINT; N : in NUM; C : out CHAR);
   -- Get the character that is at the given line and
   -- position
   -- P : ID for a line

   procedure PUT_CHAR
      (P : in LINEPOINT; N : in NUM; C : in CHAR);
   -- Put the given character into the given position
   -- of the given line
   -- P : ID for a line
   -- N : Column of the line

   procedure FRONT_INSERT
      (P : in out LINEPOINT; L : in LINETYPE);
   -- Insert the line in front of the given
   -- line position
   -- P : ID for a line
   -- L : Content of a line

   procedure REAR_INSERT
      (P : in out LINEPOINT; L : in LINETYPE);
   -- Insert the line at the rear of
   -- the given line position
   -- P : ID for a line
   -- L : Content of a line

   underflow : EXCEPTION;

   overflow : EXCEPTION;

private
   type NODE;
   type LINEPOINT is access NODE;
   type NODE is record
      content : LINETYPE;
      front : LINEPOINT;
      rear : LINEPOINT;
   end record;

end LINE;



(6). Symbol Table Module. This module will
manage a symbol table. It is designed for general symbol
manipulation.

generic
   type ITEMTYPE is private;
package SYMBOLTABLE is

   N : constant := 200;   -- size of symbol table

   ITEMSIZE : constant := 20;

   type ITEM is new STRING (1 .. ITEMSIZE);

   procedure ADD (X : in ITEM; I : in ITEMTYPE);
   -- Insert an item and the information
   -- associated with it into the SYMBOLTABLE

   function IN_TABLE (X : in ITEM) return BOOLEAN;
   -- Check to see if an item is in
   -- the SYMBOLTABLE

   function GET (X : in ITEM) return ITEMTYPE;
   -- Retrieve the information associated
   -- with an item in the SYMBOLTABLE

   function FULL return BOOLEAN;
   -- Determine whether or not the SYMBOLTABLE
   -- is full

   procedure CLEAR;   -- empty table
   -- Reinitialize (reset) the SYMBOLTABLE

end SYMBOLTABLE;
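The symbol table operations can be sketched in a few lines. This is a Python illustration under stated assumptions: the class name SymbolTable, the use of a dictionary for storage, and the padding or truncation of names to ITEMSIZE characters are choices of this sketch, not details given in the thesis.

```python
class SymbolTable:
    """Bounded table mapping fixed-width names to associated information."""

    def __init__(self, capacity: int = 200, item_size: int = 20) -> None:
        self._capacity = capacity     # N in the specification
        self._item_size = item_size   # ITEMSIZE in the specification
        self._items = {}

    def _key(self, name: str) -> str:
        # Pad or truncate to a fixed width, like a fixed-length string.
        return name.ljust(self._item_size)[: self._item_size]

    def add(self, name: str, info) -> None:
        key = self._key(name)
        if key not in self._items and self.full():
            raise OverflowError("symbol table is full")
        self._items[key] = info

    def in_table(self, name: str) -> bool:
        return self._key(name) in self._items

    def get(self, name: str):
        # Raises KeyError if the name was never added.
        return self._items[self._key(name)]

    def full(self) -> bool:
        return len(self._items) >= self._capacity

    def clear(self) -> None:
        self._items.clear()
```

A program module could, for instance, load FORTRAN keywords with add("DO", "keyword") and then test each scanned word with in_table.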



b. Active modules

(1). Input Module. This module hides the input
format. It reads the original lines from the input media
and calls procedures provided by the line module to store
the lines inside the line object.



with TEXT_IO;
with LINE;

generic
   type LINEPOINT is private;
package INPUT is

   subtype INFILETYPE is TEXT_IO.FILE_TYPE;

   procedure READFILE
      (INFILE : in INFILETYPE; START : out LINEPOINT);
   -- Read the input file and store each line into the
   -- internal line structure using the LINE module
   -- INFILE : The input file that holds the source program
   -- START  : The starting line ID of the internal
   --          structure

end INPUT;



(2). Output Module. This module will hide the
output media. It will output the indented results, the
construct form of the input program, and the input, using
other modules (indent, line and so on).

with TEXT_IO;
with LINE;
with INDENT;

generic
   type LINETYPE is private;
package OUTPUT is

   subtype OUTFILETYPE is TEXT_IO.FILE_TYPE;
   subtype CODEFILETYPE is TEXT_IO.FILE_TYPE;

   procedure PRINT_OUTFILE
      (OUTFILE : out OUTFILETYPE; START : in LINEPOINT);
   -- Print the indented output into OUTFILE using the
   -- indent and line modules
   -- OUTFILE : The output file that has
   --           the indented source program
   -- START   : Line start ID of the internal structure

   procedure PRINT_CODEFILE
      (CODEFILE : out CODEFILETYPE; START : in LINEPOINT);
   -- Print the code documentation using the line and
   -- indent modules
   -- CODEFILE : The output file that has
   --            the code documentation
   -- START    : Line start ID of the internal structure

end OUTPUT;



(3). Statement Module. This module manages the
statement object and also provides a set of procedures
available to other modules that use the statement object, by
using the line module procedures.



with LINE;

generic
   type INDENTPOINT is private;
package STATEMENT is

   subtype NUM is NATURAL;
   subtype CHAR is CHARACTER;

   type INDENTPOINT is access INDENTNODE;

   type STATEPOINT is access NODE;

   type NODE is record
      content : STATETYPE;
      front : STATEPOINT;
      rear : STATEPOINT;
   end record;

   type STATETYPE is record
      from : POSITION;
      to : POSITION;
      information : INDENTPOINT;
   end record;

   type POSITION is record
      line : LINEPOINT;
      column : NUM;
   end record;

   procedure GET_STATE_DELIMITOR (D : in CHAR);
   -- Get the statement delimitor

   function END_OF_STATE (D : CHAR) return BOOLEAN;
   -- Check the end of a statement

   procedure GET_STATE
      (P : in out STATEPOINT; L : out STATETYPE);
   -- Get a statement using the LINE module

   procedure PUT_STATE
      (P : in out STATEPOINT; L : in STATETYPE);
   -- Put a statement using the LINE module

   procedure STATE_LENGTH
      (P : in STATEPOINT; N : out NUM);
   -- Compute the length of a given statement

   procedure RECOGNIZE_STATEMENT
      (P : in out STATEPOINT; L : in LINEPOINT);
   -- Recognize the statement from
   -- the internal line structure

   procedure GET_CHAR
      (P : in STATEPOINT; N : in NUM; C : out CHAR);
   -- Get a character from the given statement
   -- and column

   procedure PUT_CHAR
      (P : in STATEPOINT; N : in NUM; C : in CHAR);
   -- Put a character into the given statement
   -- and column

   procedure FRONT_INSERT
      (P : in out STATEPOINT; L : in STATETYPE);
   -- Insert the given statement in
   -- front of the given statement ID

   procedure REAR_INSERT
      (P : in out STATEPOINT; L : in STATETYPE);
   -- Insert the given statement at the
   -- rear of the given statement ID

   underflow : EXCEPTION;

   overflow : EXCEPTION;

end STATEMENT;
(4). Indent Module. This module will indent
each line using the line module, statement module and blank
module. The indentation policy can be decided here, e.g.
the size of each level, the treatment of blanks, and so on.



with BLANK;
with STATEMENT;
with LINE;

generic
   type POLICYTYPE is private;
   type CONSTRUCTTYPE is private;
package INDENT is

   type INDENTPOINT is access INDENTNODE;
   type INDENTNODE is record
      level : NUM;
      construct : CONSTRUCTTYPE;
   end record;

   procedure INDENT
      (P : in STATEPOINT; L : out LINETYPE);
   -- Indent a line, i.e. insert or delete blanks
   -- and make line breaks according to the source
   -- program syntax, using the information about
   -- level, construct type and so on

   procedure GET_POLICY
      (P : in POLICYTYPE);
   -- Get the indentation and objective policies,
   -- for example, each level has 3 blanks
   -- and with indentation error messages.

   procedure PUT_POLICY
      (P : out POLICYTYPE);
   -- Put the indentation and objective policies

   procedure GET_INFORMATION
      (P : in STATEPOINT; L : out STATETYPE);
   -- Get the information for indentation
   -- and level documentation

   procedure PUT_INFORMATION
      (P : in STATEPOINT; L : in STATETYPE);
   -- Put the information for indentation
   -- and level documentation

end INDENT;
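How a level number and an indentation policy combine can be sketched as follows. This is a simplified Python illustration, not the thesis implementation: the names OPENERS, assign_levels and indent_line are hypothetical, the input is assumed to be a list of construct lines such as "IF (COND) THEN" or "END DO", and the policy is reduced to a single number of blanks per level (3, as in the policy example above).

```python
# Words that open a construct and therefore raise the level of what follows.
OPENERS = ("IF", "DO", "WHILE", "CASE")


def assign_levels(constructs):
    """Pair each construct line with its nesting level."""
    level = 0
    result = []
    for text in constructs:
        words = text.split()
        word = words[0] if words else ""
        if word == "END":
            # A closing line sits at the level of its opening line.
            level -= 1
            if level < 0:
                raise ValueError("level underflow: END without an opener")
            result.append((level, text))
        elif word == "ELSE":
            # ELSE and ELSE IF line up with the enclosing IF.
            if level == 0:
                raise ValueError("ELSE outside of an IF construct")
            result.append((level - 1, text))
        else:
            result.append((level, text))
            if word in OPENERS:
                level += 1
    return result


def indent_line(text, level, blanks_per_level=3):
    """Apply the policy: strip old leading blanks, add level * size blanks."""
    return " " * (level * blanks_per_level) + text.lstrip(" ")
```

Changing blanks_per_level changes the layout without touching assign_levels, which is the separation of policy from mechanism that the indent module is meant to provide.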



(5). Program Module. This module will hide the
program characteristics. It is highly dependent on
each programming language. It has two procedures: scanner
and parser.

with LINE;
with STATEMENT;
with BLANK;
with SYMBOLTABLE;
with LEVEL;
with STACK;
with QUEUE;

package PROGRAMPART is

   procedure SCANNER
      (P : in out STATEPOINT; L : out ITEMTYPE);
   -- Scan the source program and recognize
   -- each statement type for the parser

   procedure PARSER;
   -- Recognize the construct of the source
   -- program

end PROGRAMPART;

(6). Master Module. This module will control
all of the above modules.

with PROGRAMPART;
with INDENT;
with INPUT;
with OUTPUT;

procedure MASTER;
-- Control all the modules for reformatting
-- and level structure documentation
Figure 5.1 Module Interface.

The above figure shows the interfaces of
each module. Each arrow indicates that one module uses another.

D. EXAMPLE (FORTRAN) 

1. Standard Form

There have been many attempts to standardize the
FORTRAN programming language. Here, the standard form will
follow the concept of COMPATIBLE FORTRAN [Ref. 1]. The
following represents the rough standard form.
a. Basic Components

The basic components consist of four elements: character set,
symbolic names, constants and array elements.

b. Statements

(1). Statement Components. Statements are made
up of such components as labels, keywords, symbolic names,
constants and special characters. For Compatible FORTRAN, a
stricter rule should be observed: (1) Statement labels,
keywords, symbolic names, and integer constants should not have
embedded blanks, except for the keywords GO TO, DOUBLE
PRECISION and BLOCK DATA, which may have blanks in the
positions shown. (2) Where two alphabetic or numeric statement
components come together with no other special
characters between them, a blank should be inserted. Examples
are:

    DO15I=1,10                          DO 15 I=1,15
    REWINDJ       should be written     REWIND J
    REALAAA                             REAL AAA

(3) Keywords, labels, symbolic names or constants should not
be split between two lines.



(2). END Line. END is not considered a statement
but is a type of line. It may not be labelled, executed
or continued. Note especially that END is not an executable
statement with the same effect as RETURN in a subprogram or
STOP in the main program.

(3). Format of Statements. The Standard limits
each statement to one initial line and not more than 3
continuation lines.



(4). Order of Statements. The following table
shows the order of statements. By 'header statement' is meant
a SUBROUTINE, FUNCTION or BLOCK DATA statement. Horizontal
lines within the table indicate that entities above the
line must precede entities below the line (if present).
TABLE III

Table of Statement Order

    Header Statement
    Type Statements
    Comment Lines
    DIMENSION Statements  |  EXTERNAL
    COMMON Statements     |
    EQUIVALENCE Statements
    DATA Statements
    Statement Functions
    Executable Statements
    STOP Line
    FORMAT Statements
    END line

Vertical lines indicate that the entities on either side of
the line may be intermingled [Ref. 1].

c. Specification Statements

Specification statements are non-executable
statements which give information to the compiler. They
consist of TYPE (DOUBLE PRECISION, INTEGER, REAL, LOGICAL,
and COMPLEX), DIMENSION, COMMON, DATA and EQUIVALENCE.

d. Transfer of Control

This consists of the GO TO statement, Computed
GO TO statement, RETURN and STOP statements, Arithmetic IF
statement, Logical IF statement, DO statement, and CONTINUE
statement.
e. Input/Output

This consists of the WRITE statement, READ
statement, ENDFILE statement, REWIND statement, BACKSPACE
statement and FORMAT statement.

f. Expression and Assignment

This consists of the Arithmetic expression,
Logical expression, and Assignment statement.

g. Program Units

This consists of the Main program, Function
subprograms, Block Data, and Subroutine subprograms.

2. Structured Form

The algorithm language [Ref. 18] is convenient for
representing the generalized construct structure. So, to
represent the structured FORTRAN form, it will be compared
with the algorithm language. The detailed structured forms are as
follows:

ALGORITHM LANGUAGE                      FORTRAN IV

1. ALGORITHM

ALGORITHM algorithm_name                same, with 'C' in column 1
   statements
END algorithm_name

2. IF_THEN_single statement

IF condition THEN                          IF (condition) statement
   statement
END IF

3. IF_THEN_multiple statements

IF condition THEN                          IF (.NOT. condition) GO TO 10
   statements                                 statements
END IF                                  10 CONTINUE

4. IF_THEN_ELSE construct

IF condition THEN                          IF (.NOT. condition) GO TO 5
   statements_1                               statements_1
ELSE                                       GO TO 6
   statements_2                          5 CONTINUE
END IF                                        statements_2
                                         6 CONTINUE

5. Multiway selection : ELSE IF

IF condition_1 THEN                        IF (.NOT. condition_1) GO TO 10
   statements_1                               statements_1
ELSE IF condition_2 THEN                   GO TO 20
   statements_2                         10 IF (.NOT. condition_2) GO TO 11
ELSE IF condition_3 THEN                      statements_2
   statements_3                            GO TO 20
ELSE                                    11 IF (.NOT. condition_3) GO TO 12
   statements_4                               statements_3
END IF                                     GO TO 20
                                        12 CONTINUE
(ELSE is optional)                            statements_4
                                        20 CONTINUE

6. WHILE repetition

WHILE condition DO                       5 IF (.NOT. condition) GO TO 6
   statements                                 statements
END WHILE                                  GO TO 5
                                         6 CONTINUE

7. REPEAT repetition

REPEAT                                   5 CONTINUE
   statements                                 statements
UNTIL condition                            IF (.NOT. condition) GO TO 5

8. DO FOR repetition

FOR I <- L TO H BY N DO                    DO 10 I = L,M,N
   statements                                 statements
END FOR                                 10 CONTINUE

(BY N can be omitted, in                (,N can be omitted, in which case
which case BY 1 is assumed)             ,1 is assumed)

9. Multiway selection : CASE

CASE variable OF                           IF (variable.LT.1) GO TO 20
   1 :                                     IF (variable.GT.3) GO TO 20
      statements_1                         GO TO (11,12,13), variable
   2 :                                  11 CONTINUE
      statements_2                            statements_1
   3 :                                     GO TO 30
      statements_3                      12 CONTINUE
   ELSE                                       statements_2
      statements_4                         GO TO 30
END CASE                                13 CONTINUE
                                              statements_3
(ELSE is optional)                      20 CONTINUE
                                              statements_4
                                        30 CONTINUE

10. FUNCTION

FUNCTION function_name(parm_1,...,parm_n)
   statements
   function_name <- expression
END function_name

                                        data_type FUNCTION function_name(parm_1,...,parm_n)
                                           statements
                                           function_name = expression
                                           RETURN
                                        END

11. PROCEDURE (SUBROUTINE)

PROCEDURE procedure_name(parm_1,...,parm_n)
   statements
END procedure_name

                                        SUBROUTINE subroutine_name(parm_1,...,parm_n)
                                           statements
                                           RETURN
                                        END
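As an illustration of the correspondence above, the following Python sketch emits the FORTRAN IV form of the WHILE construct (item 6). It is a minimal sketch under stated assumptions: the function names fortran_line and while_to_fortran are hypothetical, labels are placed right-justified in columns 1 to 5, statements start in column 7, and the condition is wrapped in parentheses after .NOT. to keep the negation unambiguous.

```python
def fortran_line(statement, label=None):
    """Build one fixed-form line: label in columns 1-5, text from column 7."""
    field = "" if label is None else str(label)
    if len(field) > 5:
        raise ValueError("a statement label has at most 5 digits")
    # Five-column label field, a blank in column 6, then the statement.
    return f"{field:>5} {statement}"


def while_to_fortran(condition, body, top_label=5, exit_label=6):
    """Translate WHILE condition DO body END WHILE into GO TO form."""
    lines = [
        fortran_line(f"IF (.NOT. ({condition})) GO TO {exit_label}", top_label)
    ]
    lines += [fortran_line(statement) for statement in body]
    lines.append(fortran_line(f"GO TO {top_label}"))
    lines.append(fortran_line("CONTINUE", exit_label))
    return lines


if __name__ == "__main__":
    for line in while_to_fortran("I .LT. 10", ["I = I + 1"]):
        print(line)
```

The other constructs in the list follow the same pattern of a negated test, a forward GO TO, and a labelled CONTINUE that closes the construct.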



3. Format Grammar

This grammar represents the construct format of
structured FORTRAN. It is a subset of the generalized format
structure. The control structure is limited to 5 structures:
if, case, while, until, and do. In the declaration part,
the declarations will be statements. For more detail, the
grammar figures (Appendix B) can be referenced.

4. Implementation

a. Limitations

An ADA compiler was not available for this work.
So, the PASCAL programming language was used to implement
the system. This implementation is a little different from
the design of the previous section because PASCAL does not
support all the ADA programming features. In order to simplify
the implementation, just a subset of the system was implemented;
for example, the UNTIL construct is omitted.

Also, the implemented system does not cover all of
standard FORTRAN; it does not include some keywords such as
PAUSE, REWIND and so on. The other limitations are
the following:

1. All input programs should be syntactically
correct to get proper indentation and level documentation.

2. All input FORTRAN programs should conform to
the standard structured form described in the previous sections.

3. The input lines should be short enough to be indented without
being extended onto the next line. That is, the implemented
system does not have a line break function.

b. Internal Data Structure

(1). Line Data Structure. The input line and
output line are represented as an array of characters.
Normally, programming languages use 80 columns per line. In
actual programs, most lines do not use all of the columns;
the mean programming line size is 34 [Ref. 2]. If the
maximum array is assigned for each line, space is wasted. So,
to save memory and make the line flexible, a doubly linked
data structure was used for the internal line structure.
Also, a sentinel node is used. It allows an easy check
for an empty input file.
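A doubly linked line structure with a sentinel node can be sketched as follows. This Python illustration is not the PASCAL implementation; the names LineNode and LineList are hypothetical, and each line is assumed to be stored as a string of its actual length rather than as an 80-column array.

```python
class LineNode:
    def __init__(self, content=None):
        self.content = content
        self.front = self   # previous node
        self.rear = self    # next node


class LineList:
    """Circular doubly linked list of lines around a sentinel node."""

    def __init__(self):
        # The sentinel holds no line; it points to itself when empty.
        self.sentinel = LineNode()

    def is_empty(self):
        # An empty input file leaves the sentinel linked only to itself.
        return self.sentinel.rear is self.sentinel

    def rear_insert(self, node, content):
        """Insert a new line after `node` and return the new node."""
        new = LineNode(content)
        new.front = node
        new.rear = node.rear
        node.rear.front = new
        node.rear = new
        return new

    def front_insert(self, node, content):
        """Insert a new line before `node` and return the new node."""
        return self.rear_insert(node.front, content)

    def append(self, content):
        """Add a line at the end of the file."""
        return self.rear_insert(self.sentinel.front, content)

    def __iter__(self):
        node = self.sentinel.rear
        while node is not self.sentinel:
            yield node.content
            node = node.rear
```

Because the sentinel is always present, insertion at the start, middle or end of the file uses the same code path, and the empty-file check is a single comparison.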

(2). Statement Data Structure. As shown above,
the relationship of line and statement is one to one or many
to one. Clearly, the statement can be represented by the
line data structure. So, a line record will have information
about statements. Comment statements will be ignored for
statement representation.

(3). Construct Data Structure. The construct
will have some relationship with the statements, e.g. one to
one for simple statements, one to many for others. The
statements can hold the information of the construct, since
every construct can be separated into statements. For
example, the DO construct consists of a DO_COND statement,
a compound statement and an END_DO statement. But here, the line
will also hold the construct information. This is possible
since the relationship of line and statement is also one to one
or many to one.
c. The Program and Example Input/Output

Anyone interested in obtaining a copy of the
program should contact the author directly or the Computer
Science Department at the Naval Postgraduate School,
Monterey, California. The example input and output can be
referenced in Appendix C. The example program does not have
any meaning. It is written just to show the constructs of
programs and the results of program execution.
VI. CONCLUSION

One of today's software problems is the very high cost
of developing and maintaining software. Much research has
been devoted to solving this problem. One way to address
today's software crisis is to study software tools that can
help people who work in the software area.

This thesis designed and partially implemented a program
family of extended pretty printers that can help to solve
software problems by improving the readability and
understandability of programs.

The system will work for almost any structured programming
language and for various secondary functions with only
small changes in some modules. The design presented here is
for a program family of pretty printers. The program
implemented here is one member of this family. Other members of
the program family remain to be implemented.
APPENDIX A

GENERALIZED CONSTRUCT FLOW CHART

Figure A.1 Program Structure.

Figure A.2 Declaration.

Figure A.3 Subprocedure.

Figure A.4 Main Procedure.

Figure A.5 Compound Statement.

Figure A.6 If Statement.

Figure A.7 Case Statement.

Figure A.8 While Statement.

Figure A.9 Until Statement.

Figure A.10 Do Statement.

Figure A.11 Block Statement.
APPENDIX B

STRUCTURED FORTRAN FORMAT CONSTRUCT FLOW CHART

Figure B.1 Program Structure.

Figure B.2 Subroutine.

Figure B.3 Main.

Figure B.4 Compound Statement.

Figure B.5 If Statement.

Figure B.7 While Statement.

Figure B.8 Until Statement.

Figure B.9 Do Statement.

Figure B.10 Case_Cond.

Figure B.11 Go_If.

Figure B.12 Go_Cont.

Figure B.13 State Chart 1.

Figure B.14 State Chart 2.

Figure B.15 State Chart 3.

Figure B.16 Continuation of State Chart 3.

Figure B.17 Continuation of State Chart 3.
APPENDIX C

EXAMPLE OF INPUT AND OUTPUT



*** INPUT NOT INDENTED ***



c*** 

c** * 

c** * 
c*** 
c 
c 

c*** 

c 

c*** 

c 

c#* * 

c 



c 

c*** 

c 

c 

c 

c 

c 

c 

c 

c 



***** *************** *************** ********** 

************ 

TEST PROGRAM FCE THE AUTOMATIC ************ 
INDENTATION PROGRAM ************ 

************ 

* ********* * *** ****** *** ********************** 



********************************************* 
MAIN PROGRAM 

* ** * ****** * * *********** ****************** **** 



100 



DECLARATION 

REAL E 1, R 2 ,R3 , RE (2 0) 
INTEGER II ,I2,I3,ID(20) 
LOGICAL L 1 ,L2, L3 

CCKECUND STATEMENT 

SIMPLE STATEMENT 
IF STATEMENT 
CASE STATEMENT 
WHILE STATEMNET 
CO STATEMENT 



II ,12,13 



READ (5,100) 
FCFMAT (315 
LI = .TRUE. 

R 1 = 1.5 
1+5. 0-6.7 
2+4.8 

CALL SU3(RD,ID) 



IF (.NOT. (II 
DO 500 I = 1 
R (I) = 0.0 
500 CONTINUE 
GO TO 444 

1 IF (.NOT. (12. 
1 1 1 IF ( . NCT. LI) 

11 = 11+1 
GC TO 11 1 
11 CONTINUE 
GO TO 444 

2 IF (I3.NE.3) 
12 = 12+1 

GO TO 44 4 

3 CCNTINUE 
13=13+1 

444 CCNTINUE 
R (1 1) =5. 5 
IF (II .LT. 5) 
IF (II .GT. 3 



NS'. 1) ) 
20 



GO TO 1 



NE .2 
GO 



U 



GO 

1 1 



TO 2 



GO TO 3 



GO 

GO 



TO 

TO 



555 

555 



66 



GO TO 
CONTIN 



5,6,7) 



RD (1) = 
RD 2 = 
GO TO 



& 



567 

555 



= 5. 0 
5. 0 
666 
CONTINUE 
RD (1) =6. 0 
RD (2 =6. 0 
GC TO 66 6 
CONTINUE 
DO 567 1=1,19 
RD (I) = FL 0 A T ( I ) 
CONTINUE 
RD (20) =40. 0 
CONTINUE 
R 1 = 4. 9 
11=4*12 
STOP 
END 



C 

c 

c 

c 

C*************** ** ********* *************** * ** ** ** 
C SUBROUTINE PROGRAM 

C** ************* ** ********* ********************** 
SUBROUTINE SUB (RD, ID) 

C 

C*** DECLARATION 

c 

REAL R1, R2,R3, RE (20) 

INTEGER II ,12,13,10(20) 

LOGICAL LI ,L2, L3 
C 

c *** SIMPLE STATEMENT 
C 

READ (5,10 0) 11,12,13 
100 FORMAT (31 5 ) 

LI = .TRUE. 

R 1 = 1.5 
C 

c *** IF STATEMENT 
C*** CO STATEMENT 
C 

IF (.NOT. (I 1. NE. 1) ) GO TO 1 
DO 500 I = 1,20 
R (I) =0.0 
1 CONTINUE 
RETURN 
END 



*** END OF INPUT *** 
*** PROGRAM CONSTRUCT ***



DECLARATION 

DECLARATION 

DECLARATION 

SIMPLE 

SIMPLE 

SIMPLE 

SIMPLE 

SIMPLE 

IF (COND) THEN 
DO (COND) 

SIMPLE 
END DO 

ELSE IF (COND) 

WHILE (COND) DO 
SIMPLE 
END WHILE 
ELSE IF (COND) 

SIMPLE 

ELSE 

SIMPLE 
END IF 
SIMPLE 
CASE VAR 
CONST : 

SIMPLE 
SIMPLE 
CONST : 

SIMPLE 
SIMPLE 
CONST : 

DO (COND) 

SIMPLE 
END DO 
SIMPLE 
END CASE 
SIMPLE 
SIMPLE 

STOP

END OF PROGRAM
SUBROUTINE
DECLARATION 
DECLARATION 
DECLARATION 
SIMPLE 
SIMPLE 
SIMPLE 
SIMPLE 

IF (COND) THEN 
DO (COND) 

SIMPLE 
END DO 

RETURN 

END OF PROGRAM 



*** END OF CONSTRUCT *** 
*** OUTPUT INDENTED ***



C444 4 

c* * * 

C*4 4 

c** * 

C44 4 
c** 44 

c 

c 

c 

c *44 * 

c 

c** * 

c 



c 

c*** 

c 

c 

c 

c 

c 

c 

c 

c 



$$** 444 :*** 4*4 4444***44 44 * 4*4444444 44 4 44 4 44*4 

44 *444444444 

TEST PROGRAM FOB THE AUTOMATIC ************ 
INDENTATION PROGRAM ************ 

****** ** *** * 

****** *** * *** ********* ****************** ** * * 



*********** ******** *** ********************** 
MAIN PROGRAM 

4444*4 4 44 44 44 44 4444444 4 44 4444 44 4 444444 44 44 44 



DECLARATION 

REAL R1,R2,R3,RD(20) 
INTEGER 11,12,13,10(20) 
LOGICAL LI ,L2, L3 

COMPOUND STATEMENT 

SIMPLE STATEMENT 
IF STATEMENT 
CASE STATEMENT 
WHILE STATEMNET 
DO STATEMENT 



GO 

20 



READ (5, 100) II, 12, 13 
100 F0RMAT(3I5) 

LI = .TRUE. 

R 1 = 1.5 

1 +5.0-6.7 

2 +4. 8 

CALL SUB (RE, ID) 

IF( .NOT. ill. NE. 1 ) ) 

DO 500 I = 1 
R (I) =0 .0 

500 CONTINUE 

GO TO 444 

1 IF(.NCT. (I2.NE.2) ) GO 

111 IF (.NOT. LI) GO 

11 = 11+1 
GO TC 111 

11 CONTINUE 

GO TO 444 

2 IF ( 13 .NE. 3) GO TO 3 

12 = 12+1 
GO TO 444 

3 CONTINUE 

13=13+1 

444 CONTINUE 

R (II) =5. 5 
IF(I1.LT.5) GO 
IF ( 1 1 .GT. 3) GO 
GO TO (5 ,6,7) 

5 CONTINUE 

RD ( 1 ) =5 . 0 
RD ]2 =5. 0 
GO TO 666 

6 CONTINUE 



TO 1 



TO 2 
TO 1 1 



TO 

TO 



555 
5 55 



RD ( 1 ) =6 . 
RD (2) =6. 



GO TO 666 
CONTINUE 



6 9 



DO 567 1=1 , 19 

HD (I) = FLO AT (I) 

567 CONTINUE 

RD (20) =40. 0 
555 CONTINUE 

H 1= 4 . 9 
11=4*12 

STOP 

END 

C 

C 

C 

C 

C*** ************ *********** ************ ********** 
C SUBROUTINE PROGRAM 

C**** ************************************** ****** 

SUBROUTINE SUB (ED, ID) 

C 

c *** DECLARATION 
C 

REAL R 1, R 2 ,R3 , RD (2 0) 

INTEGER II ,12, I3,ID(2 0) 

LOGICAL LI ,L2 , 13 
C 

c *** SIMPLE STATEMENT 
C 

READ (5, 100) 11,12,13 
100 FORM AT (3 1 5) 

LI = .TRUE. 

R 1 = 1.5 
C 

c*** IF STATEMENT 
c *** D0 STATEMENT 
C 

IF(.NOT. (I1.NE. 1 )) GO TO 1 
DO 5 00 I = 1.20 
R (I ) =0 .0 

1 CONTINUE 

RETURN 
END 



*** END OF OUTPUT *** 